PROJECT SUMMARIES


Summarized below are five of the several areas in which significant progress has been made.

1.  Autonomous  Learning,  Pattern  Recognition,  and  Prediction  [Articles  9,  10, 
and  11] 

An important open problem in applied mathematics and technology is to design autonomous
systems capable of learning to recognize and predict nonstationary data in which
mixtures  of  rare,  frequent,  and  unexpected  events  may  occur.  In  order  to  cope  with  rare 
events,  fast  learning  is  needed.  Fast  learning  can,  however,  destabilize  many  learning 
schemes.  In  order  to  cope  with  nonstationary  combinations  of  rare  and  frequent  events, 
different degrees of generalization, or code compression, must be learnable by a single system.
Many learning schemes cannot simultaneously operate at multiple scales of coarseness.
In  order  to  rapidly  learn  different  predictions  in  response  to  rare  events  than  to  a  cloud  of 
similar  frequent  events  in  which  they  are  embedded,  predictive  feedback  about  success  or 
failure needs to operate in real time using only local operations to separate the rare exemplar
from the frequent cloud. Many learning schemes that use predictive feedback can only
operate  in  an  off-line  mode,  or  need  to  use  slow  learning,  or  are  computed  using  non-local 
operations. 

The  present  work  introduces  a  new  class  of  real-time  neural  networks  that  overcome  all  of 
these  problems.  These  neural  networks  are  defined  by  high-dimensional  nonlinear  dynamical 
systems that operate at multiple time scales. They are designed to carry out fast, stable, autonomous
learning of recognition codes and multidimensional maps in response to arbitrary
sequences of input patterns. In order to learn quickly and stably in response to a nonstationary
input stream, the networks incorporate operations that were derived from an analysis
of  human  cognition,  and  that  have  been  used  to  explain  and  predict  many  behavioral  and 
neural  data.  These  operations  include  the  learning  of  abstractions  and  expectations,  paying 
attention,  hypothesis  testing,  memory  search,  novelty  detection,  and  confidence.  Dynamical 
systems  that  embody  these  operations  are  often  called  Adaptive  Resonance  Theory,  or  ART, 
networks  because  such  a  network  enters  a  resonant  state  when  it  pays  attention  to  data 
about which it will learn.

The new neural network architecture, called ARTMAP, autonomously learns to classify
arbitrarily  many,  arbitrarily  ordered  vectors  into  recognition  categories  based  on  predictive 
success. This supervised learning system is built up from a pair of ART modules (ARTa
and ARTb) that are capable of self-organizing stable recognition categories in response to
arbitrary sequences of input patterns. During training trials, the ARTa module receives a
stream {a(p)} of input patterns, and ARTb receives a stream {b(p)} of input patterns, where
b(p) is the correct prediction given a(p). These ART modules are linked by an associative
learning network and an internal controller that ensures autonomous system operation in
real time. During test trials, the remaining patterns a(p) are presented without b(p), and
their predictions at ARTb are compared with b(p). Tested on a benchmark machine learning
database  in  both  on-line  and  off-line  simulations,  the  ARTMAP  system  learns  orders  of 
magnitude  more  quickly,  efficiently,  and  accurately  than  alternative  algorithms,  and  achieves 
100%  accuracy  after  training  on  less  than  half  the  input  patterns  in  the  database. 

ARTMAP  achieves  these  properties  by  using  an  internal  controller  that  realizes  a  new 
Minimax  Learning  Rule,  which  conjointly  maximizes  predictive  generalization  and  minimizes 
predictive error by linking predictive success to category size on a trial-by-trial basis, using
only local operations. This computation increases the vigilance parameter ρa of ARTa by
the minimal amount needed to correct a predictive error at ARTb. Parameter ρa calibrates
the minimum confidence that ARTa must have in a category, or hypothesis, activated by an
input a(p) in order for ARTa to accept that category, rather than search for a better one
through an automatically controlled process of hypothesis testing. Parameter ρa is compared
with the degree of match between a(p) and the top-down learned expectation, or prototype,
that is read out subsequent to activation of an ARTa category. Search occurs if the degree
of match is less than ρa. ARTMAP is thus a type of self-organizing expert system that
calibrates the selectivity of its hypotheses based upon predictive success. As a result, rare
but important events can be quickly and sharply distinguished even if they are similar to
frequent events with different consequences.

Between input trials ρa relaxes to a baseline vigilance ρ̄a. When ρ̄a is large, the system
runs  in  a  conservative  mode,  wherein  predictions  are  made  only  if  the  system  is  confident  of 
the  outcome.  Very  few  false-alarm  errors  then  occur  at  any  stage  of  learning,  yet  the  system 
reaches  asymptote  with  no  loss  of  speed.  Because  ARTMAP  learning  is  self-stabilizing,  it 
can  continue  learning  one  or  more  databases,  without  degrading  its  corpus  of  memories,  until 
its  full  memory  capacity  is  utilized. 
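
The vigilance test and the minimal vigilance increase described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the ARTMAP equations: it assumes binary input vectors and an ART 1 style match ratio (shared active features divided by active input features), and the names match_degree, passes_vigilance, match_track, and the increment epsilon are hypothetical.

```python
def match_degree(a, prototype):
    """Fraction of the active input features that the prototype also contains."""
    overlap = sum(min(x, w) for x, w in zip(a, prototype))
    return overlap / sum(a)


def passes_vigilance(a, prototype, rho):
    """The category is accepted only if the match reaches the vigilance rho."""
    return match_degree(a, prototype) >= rho


def match_track(a, prototype, epsilon=1e-3):
    """After a predictive error, raise vigilance just above the current match."""
    return match_degree(a, prototype) + epsilon


a = [1, 1, 1, 0]          # input pattern (assumed binary)
prototype = [1, 1, 0, 0]  # learned expectation of the chosen category
baseline = 0.5            # baseline vigilance (illustrative value)

accepted_before = passes_vigilance(a, prototype, baseline)
raised = match_track(a, prototype)          # smallest increase that forces a search
accepted_after = passes_vigilance(a, prototype, raised)
print(accepted_before, accepted_after)
```

In this toy case the category is accepted at the baseline vigilance, and after the vigilance is raised just past the current match it is rejected, which is what would trigger a search for a better category.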

2.  Global  Analysis,  Parallel  Computation,  and  Content  Addressable  Memory 
[Articles  17  and  18] 

An  important  problem  in  parallel  computation,  control  theory,  and  content  addressable 
memory  is  to  construct  dynamical  systems  which  converge  to  a  prescribed  set  of  equilibria  or 
oscillations,  and  to  only  these  dynamical  modes.  The  former  problem  includes  the  problem 
of  designing  a  global  CAM  as  a  special  case. 

In this work, two new methods for constructing systems of ordinary differential equations
realizing any fixed finite set of equilibria in any fixed finite dimension are introduced;
no  spurious  equilibria  are  possible  for  either  method.  By  using  the  first  method,  one  can 
construct  a  system  with  the  fewest  number  of  equilibria,  given  a  fixed  set  of  attractors. 

Using  a  strict  Lyapunov  function  for  each  of  these  differential  equations,  a  large  class  of 
systems  with  the  same  set  of  equilibria  is  constructed.  A  method  of  fitting  these  nonlinear 
systems  to  trajectories  is  proposed.  In  addition,  a  general  method  which  will  produce  an 
arbitrary  number  of  periodic  orbits  of  shapes  of  arbitrary  complexity  is  also  discussed. 
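
As a one-dimensional illustration of a system with a prescribed equilibrium set and a strict Lyapunov function, consider dx/dt = -(x + 1) x (x - 1), whose only equilibria are -1, 0, and 1. This toy example is an assumption made for illustration and is not either of the two construction methods summarized here; the function names, step size, and starting points are hypothetical.

```python
import math


def vector_field(x, equilibria):
    """dx/dt = -prod(x - e): zero exactly at the prescribed equilibria."""
    return -math.prod(x - e for e in equilibria)


def lyapunov(x):
    """Potential for equilibria (-1, 0, 1): V'(x) = x**3 - x, so dV/dt = -(x - x**3)**2."""
    return x**4 / 4 - x**2 / 2


def integrate(x0, equilibria, dt=0.01, steps=5000):
    """Forward-Euler trajectory starting at x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] + dt * vector_field(xs[-1], equilibria))
    return xs


equilibria = (-1.0, 0.0, 1.0)
right = integrate(0.5, equilibria)    # starts between 0 and 1
left = integrate(-0.3, equilibria)    # starts between -1 and 0
print(right[-1], left[-1])
```

Trajectories started on either side of the unstable equilibrium at 0 settle at the nearest stable equilibrium, and the potential never increases along the way, which is the defining property of a Lyapunov function for this system.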

A more general second method is given to construct a differential equation which converges
to a fixed given finite set of equilibria. This technique is much more general in that it

[...]

the Morse inequalities, it is clear that this class is not universal, because there is a large
class  of  additional  vector  fields  with  convergent  dynamics  which  cannot  be  constructed  by 
the  above  method. 

The  easiest  way  to  see  this  is  to  enumerate  the  set  of  Morse  indices  which  can  be  obtained 
by  the  above  method  and  to  compare  this  class  with  the  class  of  Morse  indices  of  arbitrary 
differential equations with convergent dynamics. The former set of indices is a proper
subclass of the latter; therefore, the above construction cannot be universal. In general, it
is  a  difficult  open  problem  to  construct  a  specific  example  of  a  differential  equation  with  a 
given  fixed  set  of  equilibria,  permissible  Morse  indices,  and  permissible  connections  between 
stable  and  unstable  manifolds. 

A  strict  Lyapunov  function  is  given  for  this  second  case  as  well.  This  strict  Lyapunov 
function  as  above  enables  construction  of  a  large  class  of  examples  consistent  with  these 
more  complicated  dynamics  and  indices.  The  determination  of  all  the  basins  of  attraction 
in  the  general  case  for  these  systems  remains  to  be  accomplished. 

In particular, a simple feedback system with an elementary coupling rule has been constructed
which generates all the possible qualitative dynamics for a convergent differential
equation; i.e., generates all the possible Morse indices which a system with convergent dynamics
produces. Strict Lyapunov functions have been obtained for these systems as well,
so  that  very  large  families  of  systems  with  similar  dynamics  can  be  constructed. 

Finally, a means to approximate arbitrary sets of nested periodic orbits with a specified
set  of  Morse  Indices  has  been  discussed.  It  is  shown  that  these  orbits  can  be  approximated 
to  any  precision  in  that  finite  sets  of  data  on  each  of  these  orbits  can  be  exactly  fit. 

This  has  many  applications.  In  particular,  the  work  can  be  used  to  construct  a  theory  of 
“nonlinear”  regression  for  differential  equations  whereby  nonlinear  dynamics  curve  fits  can 
be  specified  and  an  underlying  law  which  fits  the  data  can  be  specified. 

Such  a  theory  can  be  used  to  produce  a  general  analog  design  of  digital  components.  It 
indicates  a  method  whereby  general  analog  hardware  can  be  used  to  do  digital  transductions. 

Preliminary  results  indicate  that  Grossberg’s  (1980)  Adaptation  Level  Theorem  can  be 
extended  to  a  system  with  multiple  state-dependent  adaptation  levels.  It  is  noteworthy  that 
the class of systems treated by this extension includes a subclass of the classical
Lure-Postnikov and Popov systems studied in optimal control theory.

This  theory  promises  to  extend  the  domain  to  which  the  stability  theory  of  classical 
control  systems  can  be  applied. 

3. Temporal Prediction, Reinforcement Learning, and Autonomous Credit Assignment
[Articles 1 and 22]

An  important  problem  in  the  real-time  problem  solving  of  a  human  operator  or  machine 
concerns  the  proper  temporal  scheduling  of  actions  so  that  they  occur  when  they  are  needed, 
and  not  at  inopportune  times.  The  present  work  develops  a  new  model  of  temporal  prediction 
that  is  based  upon  an  analysis  of  how  animals  and  humans  learn  to  time  their  actions  to 
achieve  desired  goals.  The  problem  that  is  being  solved  may  be  summarized  as  follows. 

Many  goal  objects  may  be  delayed  subsequent  to  the  actions  that  elicit  them,  or  the 
environmental  events  that  signal  their  subsequent  arrival.  Humans  and  many  animal  species 
can  learn  to  wait  for  the  anticipated  arrival  of  a  delayed  goal  object,  even  though  its  time 
of  occurrence  can  vary  from  situation  to  situation.  Such  behavioral  timing  is  important  in 
the  lives  of  animals  which  can  explore  their  environments  for  novel  sources  of  gratification. 

For  example,  if  an  animal  could  not  inhibit  its  exploratory  behavior,  then  it  could  starve  to 
death  by  restlessly  moving  from  place  to  place,  unable  to  remain  in  one  place  long  enough 
to obtain food there. On the other hand, if an animal inhibited its exploratory behavior for
too long, waiting for an expected source of food to materialize, then it could starve to death.
Thus the animal’s task is to accurately time the expected delay of a goal object based upon
its  previous  experiences  in  a  given  situation.  It  needs  to  balance  between  its  exploratory 
behavior  aimed  at  searching  for  novel  sources  of  reward,  and  its  consummatory  behavior 
aimed  at  acquiring  expected  sources  of  reward.  To  effectively  control  this  balance,  the  animal 
or  machine  needs  to  be  able  to  suppress  its  exploratory  behavior  and  focus  its  attention 
upon  an  expected  source  of  reward  at  around  the  time  that  the  expected  delay  transpires 
for  acquiring  the  reward. 

This  type  of  timing  calibrates  the  delay  of  a  single  behavioral  act,  rather  than  organizing  a 
correctly  timed  and  speed-controlled  sequence  of  acts.  Suppose,  for  example,  that  an  animal 
typically  receives  food  from  a  food  magazine  two  seconds  after  pushing  a  lever,  and  that 
the  animal  orients  to  the  food  magazine  right  after  pushing  the  lever.  When  the  animal 
inspects  the  food  magazine,  it  perceives  the  nonoccurrence  of  food  during  the  subsequent 
two  seconds.  These  nonoccurrences  disconfirm  the  animal’s  sensory  expectation  that  food 
will  appear  in  the  magazine.  Because  the  perceptual  processing  cycle  that  processes  this 
sensory  information  occurs  at  a  much  faster  rate  than  two  seconds,  it  can  compute  this 
sensory  disconfirmation  many  times  before  the  two  second  delay  has  elapsed. 

The  central  issue  is:  What  spares  the  animal  from  erroneously  reacting  to  these  expected 
nonoccurrences of food during the first two seconds as predictive failures? Why does the animal
not immediately become frustrated by the nonoccurrence of food and release exploratory
behavior aimed at finding food somewhere else? If the animal does wait, but food does not
appear  after  the  two  seconds  have  elapsed,  why  does  the  animal  then  react  to  the  unexpected 
nonoccurrence  of  food  by  becoming  frustrated  and  releasing  exploratory  behavior? 

The present model shows how the timing mechanism can inhibit, or gate, a process
whereby sensory mismatches with learned expectations trigger the orienting and reinforcement
mechanisms that would otherwise reset the animal’s attentional focus, negatively reinforce
its previous consummatory behavior, and release its exploratory behavior. The process
of  registering  these  sensory  mismatches  or  matches  is  not  itself  inhibited.  For  example,  if 
the  food  happened  to  appear  earlier  than  expected,  the  animal  could  still  perceive  it  and 
eat.  Thus  the  sensory  matching  process  is  not  inhibited  by  the  timing  mechanism.  Instead, 
the  effects  of  sensory  mismatches  upon  processes  of  memory  reset  and  reinforcement  are 
inhibited. 

This  inhibitory  action  is  assumed  to  be  part  of  a  more  general  competition  that  occurs 
between  the  motivational,  or  arousal,  sources  that  energize  different  types  of  behavior.  The 
posited  inhibitory  action  is  from  the  motivational  sources  of  consummatory  behaviors  to  the 
motivational sources of orienting and exploratory behaviors. The consummatory motivational
sources are also assumed to be in mutual competition, enabling only the strongest
combinations of sensory, reinforcing, and homeostatic signals to control observable behaviors.
Thus the posited competition is a special case of the general hypothesis that the output
signals  from  all  motivational  sources  compete  for  the  control  of  observable  behaviors.  This 
competition  regulates  the  decision  process  whereby  potentially  competing  events  can  be 
scheduled  to  conjointly  satisfy  the  constraints  imposed  by  multiple  goals. 

From  a  neurobiological  perspective,  the  results  offer  a  solution  to  several  basic  problems 
about biological memory. The model is called a START model because it shows how a Spectral
Timing process can modulate ART mechanisms. The combination of these mechanisms
suggests how reinforcement learning is adaptively timed and modulates the course of recognition
learning, attention switching, memory search, selective forgetting, and the timing of
goal-oriented  actions.  The  model  suggests  how  NMDA  receptors  in  the  dentate-CA3  region 
of  the  hippocampus  may  participate  in  learning  to  adaptively  time  reinforcement  learning, 
and helps to explain the complex pattern of changes in trace conditioning, delay conditioning,
and reversal conditioning that are caused by hippocampal ablations. It is suggested
that conditioning may potentiate coordinated processes of presynaptic transmitter production
[...] behaviors. Convergence of dentate cells on CA3 pyramidal cells is suggested to create a
collective  adaptively  timed  signal  that  no  single  dentate  cell  can  generate  by  itself. 
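
The idea of a collective timed signal built from many differently timed cells can be illustrated with a toy spectrum. This sketch is not the START model's equations, which this summary does not give: the Gaussian response shape, the one-shot weight rule, and the names cell_response, learn_weights, and collective_signal are all assumptions made for illustration.

```python
import math


def cell_response(t, peak_time, width=0.3):
    """Activity of one timing cell; the Gaussian shape is an illustrative assumption."""
    return math.exp(-((t - peak_time) ** 2) / (2 * width**2))


def learn_weights(peak_times, isi):
    """Each cell's weight samples its own activity at the moment the reward arrives."""
    return [cell_response(isi, p) for p in peak_times]


def collective_signal(t, peak_times, weights):
    """Weighted sum over the whole spectrum of cells."""
    return sum(w * cell_response(t, p) for w, p in zip(weights, peak_times))


peak_times = [0.2 * k for k in range(1, 21)]   # cells peaking from 0.2 s to 4.0 s
isi = 1.1                                      # reward delay; falls between two cells
weights = learn_weights(peak_times, isi)
times = [0.01 * i for i in range(401)]         # 0 s to 4 s in 10 ms steps
signal = [collective_signal(t, peak_times, weights) for t in times]
best_time = times[signal.index(max(signal))]
print(best_time)
```

In this toy setting no individual cell peaks at the chosen delay, yet the weighted population response peaks close to it, which is the qualitative property the paragraph above attributes to convergence of many cells onto one.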

The  model  suggests  neural  mechanisms  for  several  distinct  types  of  learning:  learning  of 
adaptive timing; reinforcement learning, including emotional conditioning; incentive motivational
learning, to help focus attention and energize behavioral responses; motor learning of
discrete adaptive responses; and recognition learning. In particular, the model distinguishes
between cerebellar influences on motor learning and hippocampal influences on adaptive
timing of reinforcement learning. The model clarifies how damage to the hippocampal formation
eliminates attentional blocking and causes symptoms of medial temporal amnesia.
It  suggests  how  normal  acquisition  of  subcortical  emotional  conditioning  can  occur  after 
cortical  ablation,  even  though  extinction  of  emotional  conditioning  is  severely  retarded  by 
cortical  ablation.  The  model  also  clarifies  how  the  anatomical  sites  and  functional  properties 
of  emotional  conditioning  and  conditioning  of  adaptive  timing  differ. 

Model  interactions  between  sustained  and  transient  cells  help  to  explain  how  increasing 
the  duration  of  an  unconditioned  stimulus  increases  the  amplitude  of  emotional  conditioning, 
but  does  not  change  conditioned  timing;  and  how  an  increase  in  the  intensity  of  a  conditioned 
stimulus  “speeds  up  the  clock”,  but  an  increase  in  the  intensity  of  an  unconditioned  stimulus 
does  not.  Computer  simulations  of  the  model  fit  parametric  conditioning  data  from  the 
rabbit  nictitating  membrane  paradigm,  including  a  Weber  law  property,  inverted  U  property, 
and anomalous shift and amplification in the adaptively timed data at large interstimulus
intervals (ISIs). Both primary and secondary adaptively timed conditioning are simulated,
as are data concerning conditioning using multiple ISIs, gradually or abruptly changing
ISIs, partial reinforcement, and multiple stimuli that lead to time-averaging of responses.
Neurobiologically  testable  predictions  are  made  to  facilitate  further  tests  of  the  model. 

4.  Neural  Dynamics  of  Speech  Filtering  and  Segmentation  [Articles  1  and  20] 

A  different  type  of  temporal  processing  is  being  analyzed  in  our  work  on  early  speech 
filtering  and  coherent  speech  segmentation.  These  results  are  aimed  at  understanding  the 
specialized  filters  that  have  evolved  to  help  disambiguate  coarticulated  sounds,  such  as  vowels 
and  consonants,  under  noisy  and  unreliable  conditions.  At  a  later  processing  stage,  both 
forward-acting  and  backward-acting  contextual  effects  can  disambiguate  noisy  sounds,  via 
a  coherent  segmentation  and  completion  process,  even  in  cases  where  no  obvious  speech 
boundaries  exist. 

A  set  of  sustained  and  transient  detectors  have  been  constructed  which  can  partially 
disambiguate  coarticulated  consonants  and  vowels.  The  sustained  detectors  are  sensitive 
primarily  to  vocalic  phonemic  segments.  The  transient  detectors  are  primarily  sensitive  to 
different  aspects  of  speech  transitions  such  as  the  onset  or  offset  of  consonantal  bursts,  the 
offset of consonants, and frication stimuli. These detectors can be used to detect differing
phonemic  qualities  and  as  initial  detectors  of  various  consonantal  types  so  as  to  be  able  to 
use  more  specialized  detectors  to  precisely  identify  a  specific  type  of  stop  or  vowel. 
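
The distinction between sustained and transient detectors can be illustrated on a toy amplitude envelope. This is a hedged sketch only: the running-mean and rectified-difference definitions, the window length, and the function names are assumptions for illustration and are not the detectors constructed in this project.

```python
def sustained(signal, window=3):
    """Running mean: responds to steady energy (for example, a vowel-like segment)."""
    out = []
    for i in range(len(signal)):
        chunk = signal[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def transient_onset(signal):
    """Half-wave rectified increase: responds only when the level rises."""
    return [0.0] + [max(0.0, b - a) for a, b in zip(signal, signal[1:])]


def transient_offset(signal):
    """Half-wave rectified decrease: responds only when the level falls."""
    return [0.0] + [max(0.0, a - b) for a, b in zip(signal, signal[1:])]


envelope = [0, 0, 1, 1, 1, 1, 0, 0]   # toy amplitude envelope with one burst
on = transient_onset(envelope)
off = transient_offset(envelope)
steady = sustained(envelope)
print(on, off, steady)
```

The onset and offset channels respond only at the edges of the burst, while the sustained channel stays high through its interior, so the two detector types carry complementary information about the same signal.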

We  are  also  interested  in  formulating  a  general  theory  of  auditory  object  recognition. 
Such  a  theory  would  explain  the  ability  to  segregate  different  auditory  objects  as  part  of  the 
auditory scene. One major factor (perhaps the dominant one) in separating two auditory objects
is pitch. We have therefore been constructing a model, the Spinet (Spatial Pitch Network)
model  of  pitch  perception,  which  unifies  and  explains  much  of  the  pitch  data  in  the  literature. 
This  model  is  being  incorporated  in  a  more  general  model  of  auditory  object  recognition. 

Our  work  on  this  project  should  help  to  solve  some  of  the  types  of  radar  and  speech 
processing problems where methods like dynamic programming fail. We have started to
test  these  new  filter  models  on  databases  such  as  TIMIT  and  ISOLET,  and  on  data  such  as 
phonemic  restoration  and  backward  completion  effects. 

[...] cope with the constraints of different task domains. In this sense, all the results contribute
to  a  general  theory  of  intelligent  information  processing  by  dynamical  systems  that  carry 
out  a  biological  style  of  parallel  computation. 

5.  Measurement  Theory  [Articles  16,  17,  and  18] 

Significant progress has recently been made in the study of non-associative concatenation
structures of two variables. In recent work, Cohen has completed the classification of
general non-associative idempotent operations begun by Cohen and Narens (1979) and Luce
and Narens (1985) using the structure of the automorphism group of these operations. In addition,
weakly positive concatenation structures have been completely classified under very
mild  solvability  and  unboundedness  constraints.  Such  work  is  important  in  clarifying  the 
nature  of  differing  scale  types  such  as  ordinal,  interval,  and  ratio  scales,  when  it  is  possible 
to  construct  a  subjective  scale. 
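
For readers unfamiliar with the terminology, the arithmetic mean is a standard example of an operation that is idempotent but not associative. The short check below is only an illustration of those two properties; the choice of the mean and the name concat are assumptions and are not taken from the classification results cited above.

```python
def concat(x, y):
    """Arithmetic mean used as a toy concatenation operation."""
    return (x + y) / 2


idempotent = all(concat(x, x) == x for x in (1.0, 2.0, 3.0))
left_grouping = concat(concat(1.0, 2.0), 3.0)    # (1 o 2) o 3
right_grouping = concat(1.0, concat(2.0, 3.0))   # 1 o (2 o 3)
print(idempotent, left_grouping, right_grouping)
```

The two groupings give different values, so the order of combination matters even though combining any value with itself leaves it unchanged.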